# 4 - Encyclopedic sequences.
Is your phone number somewhere in some prime number as a substring? How about in some power of 2? For example, if your phone number is 5551234, then it is part of the prime number 55512340001 ! But is 5551234 somewhere in some power of 2? In fact, looking at the powers of 2: 1, 2, 4, 8, 16, 32, 64, 128, 512, 1024, 2048, 4096, 8192, 16384, ... Does 7 ever show up as the first digit? More or less often than 8? How about 9?
Note that this question is base dependent. Since the powers of 2 in base 2 are of the form 1, 10, 100, 1000, 10000, 100000, ..., which we can see it does not contain every possible string of 0 and 1.
Let us say a positive integer sequence $(a_{n})$ is **encyclopedic in base $b$** if for every finite word $w$ in base $b$ alphabets $\{0,1,2,3,\ldots,b-1\}$ shows up as a consecutive substring somewhere in some $a_{n}$ in base $b$. And we say $(a_{n})$ is **universally encyclopedic** if it is encyclopedic in all bases $b \ge 2$.
So the question becomes, are the sequence of prime numbers encyclopedic in some base $b$? Or perhaps universally encyclopedic? How about the powers of 2?
## Some observations.
Are there any sequences that are universally encyclopedic?
Yes, the naturals, $a_{n}=n$. Ok, but what about them that makes it encyclopedic?
A reason why $a_{n}=n$ is universally encyclopedic is because it grows slowly, so it is able to go through all the words in any base $b$. In fact, if we imagine $a_{n}$ grows at a rate slow enough, then the leading (left most) digits of $a_{n}$ would increment slowly. One situation this can happen is if $a_{n}$ grows **subarithmetically**, where there exists some $C$ such that $$
a_{n+1} - a_{n} \le C
$$for all $n$. If $a_{n}$ is strictly increasing, then by adding at most $C$ each time changes only right most few digits of $a_{n}$. In particular, as there is some $r$ where $b^{r} > C$, this shows only the right most $r$ places change each time $C$ is added, and the $(r+1)$-th digit from the right can only increment by at most $1$. Hence if we look at the places to the left of the $(r+1)$-th place, the sequence $a_{n}$ would go through all the words in base $b$ one by one!
So we have a preliminary result.
> **Proposition.** If $(a_{n})$ is an increasing integer sequence that is subarithmetic, then $(a_{n})$ is universally encyclopedic.
However, the sequence of primes, $p_{n} = \text{\(n\)-th prime}$, is not subarithmetic. The prime gap $g_{n}=p_{n+1} - p_{n}$ in fact has $\sup g_{n} = \infty$!
Why is this? Consider the following $N$ consecutive integers $(N+1)! + 2$, $(N+1)! +3$,$\ldots$, $(N+1)! + (N+1)$, which are all composites and has no primes in them. So for each $N$ there exists a prime gap of at least $N$. Hence $\sup g_{n} =\infty$, and the primes are not subarithmetic.
However, we do know the growth of the prime numbers asymptotically, by the celebrated **prime number theorem**,
> **Prime number theorem.** $p_{n} \sim n \log n$.
Here we write $f(n) \sim g(n)$ to mean $\lim_{n\to \infty} \frac{f(n)}{g(n)}=1$. It is an amazing feat that PNT is deduced. Perhaps we can use this to our advantage, that the primes grows no faster than a quadratic function?
Let us revisit the subarithmetic case again, $0\le a_{n+1}-a_{n} < C$ when $a_{n}$ strictly increase. Then we see that this implies $$
0\le \frac{a_{n+1}}{a_{n}}-1 < \frac{C}{a_{n}}.
$$By squeeze theorem, this gives $$
\lim_{n\to \infty} \frac{a_{n+1}}{a_{n}}=1.
$$
So a strictly subarithmetic sequence $(a_{n})$ implies $a_{n+1}\sim a_{n}$. Perhaps this is a condition that would also give universally encyclopedic behavior?
Indeed this is the case!
## A main result.
We claim
> If $(a_{n})$ is a strictly increasing integer sequence such that $a_{n+1}\sim a_{n}$, then it is universally encyclopedic.
Proof.
Take any word $w$ in alphabets base $b$. Without loss, $w$ does not start with a $0$, otherwise we can just pad it to start with a $1$. And let $m \in \mathbf{N}$ be the integer such that $m$ in base $b$ is $w$.
Now, to show that some term $a_{n}$ that has the left most digits to be $w$, it is to say that $$
m b^{k} \le a_{n} <(m+1)b^{k}
$$for some $k$, and some $n$. If we can do this, then we win!
Equivalently, this is $$
m \le \frac{a_{n}}{b^{k}} N$. When this happens we have $$
\frac{a_{n(k)+1}}{a_{n(k)}} < \frac{m+1}{m},
$$so $$
a_{n(k)+1} < \frac{m+1}{m} a_{n(k)}.
$$But using inequalities in $(\dagger)$, we get $$
mb^{k} \le a_{n(k)+1} < \frac{m+1}{m} a_{n(k)} < \frac{m+1}{m} m b^{k} = (m+1)b^{k}
$$as desired! $\blacksquare$
Now, note, since the primes $p_{n}\sim n \log n$, we have $p_{n+1} \sim p_{n}$ by L'Hospital rule. So the prime sequence is universally encyclopedic!